This is the analysis for the manuscript “The Voice Familiarity Benefit for Infant Phoneme Acquisition”.
Please note: the variable TestSpeaker (levels: 1, 4) in this analysis document is called Test Speaker (levels: Training Speaker and Novel Speaker) in the manuscript: TestSpeaker1 corresponds to Training Speaker, and TestSpeaker4 corresponds to Novel Speaker (note that TestSpeaker4 corresponds to Speaker3 in the manuscript!). The variable Group (levels: fam, unfam) is called Training Speaker (levels: familiar, unfamiliar) in the accompanying manuscript.
In the accompanying preregistration we explain how we set the priors based on pilot data, and we run prior predictive checks, posterior predictive checks, and a sensitivity analysis on simulated data.
All the models in this R Markdown file were fitted beforehand with the commented-out code (shown for transparency), and are then read in from file (for efficiency).
source(here("code/R","00_prepare_data_exp.R"), local = knitr::knit_global())
summary(data_acq)
## Subj MMR Group TestSpeaker VoiceTrain
## 1 : 2 Min. :-19.6389 fam :69 S1:65 Length:134
## 2 : 2 1st Qu.: -3.2603 unfam:65 S4:69 Class :character
## 3 : 2 Median : 1.1963 Mode :character
## 4 : 2 Mean : 0.7285
## 5 : 2 3rd Qu.: 4.5056
## 7 : 2 Max. : 14.6783
## (Other):122
## TestOrder age sex nrSpeakersDaily
## Min. :1143 Min. :-2.2546 Length:134 Min. :-0.56285
## 1st Qu.:1143 1st Qu.:-0.5056 Class :character 1st Qu.:-0.44973
## Median :1413 Median : 0.1941 Mode :character Median :-0.21480
## Mean :1769 Mean : 0.0000 Mean : 0.00000
## 3rd Qu.:2243 3rd Qu.: 0.7479 3rd Qu.: 0.06363
## Max. :2423 Max. : 1.4767 Max. : 6.32845
##
## sleepState BlocksAsleep Lab daysVoicetraining
## asleep: 7 Length:134 Length:134 Min. :15.00
## awake :127 Class :character Class :character 1st Qu.:18.00
## Mode :character Mode :character Median :20.00
## Mean :20.37
## 3rd Qu.:21.00
## Max. :28.00
## NA's :2
## Anmerkungen CommentsProtokoll mumDist TrialsStan
## Length:134 Length:134 Min. :-1.4128 Min. :16.00
## Class :character Class :character 1st Qu.:-0.7368 1st Qu.:38.00
## Mode :character Mode :character Median :-0.3376 Median :50.00
## Mean : 0.0000 Mean :47.11
## 3rd Qu.: 0.6807 3rd Qu.:56.00
## Max. : 3.2713 Max. :70.00
##
## TrialsDev
## Min. :11.00
## 1st Qu.:19.00
## Median :24.50
## Mean :23.56
## 3rd Qu.:28.00
## Max. :39.00
##
We based our priors on pilot data (see the preregistration): intercept normal(3.5, 20), slope normal(0, 10), and sigma normal(0, 20), giving us:
priors <- c(set_prior("normal(3.5, 20)",
class = "Intercept"),
set_prior("normal(0, 10)",
class = "b"),
set_prior("normal(0, 20)",
class = "sigma"))
Let’s visualize the priors
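A minimal sketch of such a visualization (illustrative code, not part of the original pipeline) overlays the three prior densities; note that brms truncates the sigma prior to positive values:

```r
# Overlay the three prior densities: intercept, slope (b), and sigma
x <- seq(-60, 60, length.out = 500)
plot(x, dnorm(x, mean = 3.5, sd = 20), type = "l", lwd = 2,
     xlab = "MMR (microvolts)", ylab = "Density")
lines(x, dnorm(x, mean = 0, sd = 10), lwd = 2, lty = 2)  # slope prior
lines(x, dnorm(x, mean = 0, sd = 20), lwd = 2, lty = 3)  # sigma prior (half-normal in brms)
legend("topright",
       legend = c("Intercept: normal(3.5, 20)", "b: normal(0, 10)", "sigma: normal(0, 20)"),
       lty = 1:3, lwd = 2)
```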
Here, we check what happens if we run the model based on our priors only.
num_chains <- 4
num_iter <- 4000
num_warmup <- num_iter / 2
num_thin <- 1
priorpredcheck_acq <- brm(MMR ~ 1 + TestSpeaker * Group +
mumDist +
nrSpeakersDaily +
sleepState +
age +
(1 + TestSpeaker | Subj),
data = data_acq,
prior = priors,
family = gaussian(),
control = list(
adapt_delta = .99,
max_treedepth = 15
),
iter = num_iter,
chains = num_chains,
warmup = num_warmup,
thin = num_thin,
cores = num_chains,
seed = project_seed,
file = here("data", "model_output", "A0_model_priorpredcheck_acq.rds"),
file_refit = "on_change",
save_pars = save_pars(all = TRUE),
sample_prior = "only"
)
priorpredcheck_acq <- readRDS(here("data", "model_output", "A0_model_priorpredcheck_acq.rds"))
pp <- posterior_predict(priorpredcheck_acq)
pp <- t(pp)
# distribution of mean MMR
meanMMR = colMeans(pp)
hist(meanMMR, breaks = 40)
summary(priorpredcheck_acq)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: MMR ~ 1 + TestSpeaker * Group + mumDist + nrSpeakersDaily + sleepState + age + (1 + TestSpeaker | Subj)
## Data: data_acq (Number of observations: 134)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Group-Level Effects:
## ~Subj (Number of levels: 71)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 6.29 8.53 0.16 24.15 1.00 9709
## sd(TestSpeaker1) 6.12 7.09 0.15 24.60 1.00 8891
## cor(Intercept,TestSpeaker1) 0.00 0.57 -0.94 0.94 1.00 12369
## Tail_ESS
## sd(Intercept) 3539
## sd(TestSpeaker1) 3690
## cor(Intercept,TestSpeaker1) 4873
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 3.22 20.45 -36.51 43.53 1.00 13404 5979
## TestSpeaker1 0.08 9.90 -19.48 19.65 1.00 13737 5961
## Group1 -0.02 10.09 -19.63 19.98 1.00 15720 6104
## mumDist 0.14 9.87 -19.03 19.58 1.00 14631 5940
## nrSpeakersDaily -0.13 10.24 -20.64 19.88 1.00 14969 5387
## sleepState1 0.07 9.97 -19.29 19.69 1.00 13728 6084
## age -0.12 10.03 -19.82 19.31 1.00 14171 5993
## TestSpeaker1:Group1 -0.10 9.90 -19.49 19.57 1.00 13341 5574
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 15.78 12.14 0.55 44.76 1.00 6140 3458
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
In our priors, we specified a mean of 3.5 and an SD of 20. In these
plots and the model summary, we see that the mean of the MMR is
estimated at 3.22 and is normally distributed, with an estimated error
of 20.45. Sigma was specified at normal(0,20) - the
estimate of 15.78 with an estimated error of 12.14 seems reasonable. The
effects on the slope were specified at normal(0,10) which
also seems to be reasonably estimated.
Conclusion: our priors seem to be reasonable.
To check how dependent the model’s estimates are on the priors, we
perform a sensitivity analysis. We will check the estimates of our model
with our proposed priors (intercept: normal(3.5,20), slope
normal(0,10), sigma normal(0,20)), and three
alternative priors:
1. intercept normal(0, 20), slope normal(0, 10), sigma normal(0, 20): the same SDs, but with a mean of zero for the intercept
2. intercept normal(0, 30), slope normal(0, 20), sigma normal(0, 30): weakly informative, because of the large SDs
3. intercept normal(0, 50), slope normal(0, 40), sigma normal(0, 50): uninformative, and not really biologically plausible anymore
priors_orig <-
c(set_prior("normal(3.5, 20)",
class = "Intercept"),
set_prior("normal(0, 10)",
class = "b"),
set_prior("normal(0, 20)",
class = "sigma"))
priors1 <-
c(set_prior("normal(0, 20)",
class = "Intercept"),
set_prior("normal(0, 10)",
class = "b"),
set_prior("normal(0, 20)",
class = "sigma"))
priors2 <-
c(set_prior("normal(0, 30)",
class = "Intercept"),
set_prior("normal(0, 20)",
class = "b"),
set_prior("normal(0, 30)",
class = "sigma"))
priors3 <-
c(set_prior("normal(0, 50)",
class = "Intercept"),
set_prior("normal(0, 40)",
class = "b"),
set_prior("normal(0, 50)",
class = "sigma"))
Let’s plot these different priors for the intercept:
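A sketch of such a plot (illustrative code, not part of the original pipeline) compares the four intercept priors directly:

```r
# Compare the four intercept priors used in the sensitivity analysis
x <- seq(-150, 150, length.out = 1000)
dens <- cbind(dnorm(x, 3.5, 20), dnorm(x, 0, 20), dnorm(x, 0, 30), dnorm(x, 0, 50))
matplot(x, dens, type = "l", lty = 1:4, lwd = 2, col = 1:4,
        xlab = "MMR (microvolts)", ylab = "Density")
legend("topright",
       legend = c("normal(3.5, 20)", "normal(0, 20)", "normal(0, 30)", "normal(0, 50)"),
       lty = 1:4, col = 1:4, lwd = 2)
```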
We ran the model used for the posterior checks (see above) with these three alternative priors, so that we can compare the resulting estimates:
m_orig <- readRDS(here("data", "model_output", "A1_model_sensitivity_acq.rds"))
m_alt_1 <- readRDS(here("data", "model_output", "A1_model_sensitivity_altern1_acq.rds"))
m_alt_2 <- readRDS(here("data", "model_output", "A1_model_sensitivity_altern2_acq.rds"))
m_alt_3 <- readRDS(here("data", "model_output", "A1_model_sensitivity_altern3_acq.rds"))
posterior_summary(m_orig, variable=c("b_Intercept","b_TestSpeaker1", "b_Group1", "sigma"))
## Estimate Est.Error Q2.5 Q97.5
## b_Intercept 1.374901 1.2627675 -1.109432 3.8480191
## b_TestSpeaker1 1.248532 1.1638900 -1.023524 3.5427951
## b_Group1 -1.519245 1.1270864 -3.731797 0.6912292
## sigma 5.312684 0.9310486 2.558139 6.5869484
posterior_summary(m_alt_1, variable=c("b_Intercept","b_TestSpeaker1", "b_Group1", "sigma"))
## Estimate Est.Error Q2.5 Q97.5
## b_Intercept 1.382945 1.2664752 -1.112394 3.8571511
## b_TestSpeaker1 1.250782 1.1665207 -1.025833 3.5531056
## b_Group1 -1.518124 1.1246155 -3.726675 0.6913231
## sigma 5.307165 0.9248109 2.604259 6.5843626
posterior_summary(m_alt_2, variable=c("b_Intercept","b_TestSpeaker1", "b_Group1", "sigma"))
## Estimate Est.Error Q2.5 Q97.5
## b_Intercept 1.418797 1.2960256 -1.121502 3.9510215
## b_TestSpeaker1 1.254365 1.1734003 -1.045901 3.5732266
## b_Group1 -1.529477 1.1252047 -3.741420 0.6800477
## sigma 5.307281 0.9580654 2.261416 6.5856938
posterior_summary(m_alt_3, variable=c("b_Intercept","b_TestSpeaker1", "b_Group1", "sigma"))
## Estimate Est.Error Q2.5 Q97.5
## b_Intercept 1.416889 1.3057424 -1.132296 3.9812142
## b_TestSpeaker1 1.249027 1.1809833 -1.066612 3.5717876
## b_Group1 -1.537466 1.1314584 -3.754977 0.6952199
## sigma 5.366639 0.8486062 2.997306 6.5971251
We can now also plot different estimates for the different priors:
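One way to produce such a plot (a sketch, assuming the four models are loaded as above and ggplot2/purrr/tibble are available) is to collect the posterior summaries and plot them side by side:

```r
# Collect the key estimates of all four models and plot them with 95% CIs
pars   <- c("b_Intercept", "b_TestSpeaker1", "b_Group1", "sigma")
models <- list(orig = m_orig, alt1 = m_alt_1, alt2 = m_alt_2, alt3 = m_alt_3)
est <- purrr::map_dfr(
  models,
  ~ tibble::rownames_to_column(
      as.data.frame(posterior_summary(.x, variable = pars)), "Parameter"),
  .id = "Model"
)
ggplot(est, aes(x = Parameter, y = Estimate, colour = Model)) +
  geom_pointrange(aes(ymin = Q2.5, ymax = Q97.5),
                  position = position_dodge(width = 0.6)) +
  labs(y = "Estimate (95% CI)")
```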
The alternative models have 627, 519, and 222 divergent transitions, respectively. However, for all of these models, Rhat is always 1.00 and the ESS values are high enough (the lowest is a Tail_ESS of 1275). Moreover, the estimates seem to be reliable for all models, since they are very similar.
As we can see from the (plotted) estimates, the different priors barely have any effect on the estimates of the model. This means that our model is not overly sensitive to the priors.
We set the contrasts such that we get equal priors on the different comparisons. This is important for our Bayes Factor analysis later on. However, since we only have factors with at most two levels, this is equivalent to deviation coding with 0.5 and -0.5.
contrasts(data_acq$TestSpeaker) <- contr.equalprior_pairs
contrasts(data_acq$Group) <- contr.equalprior_pairs
contrasts(data_acq$sleepState) <- contr.equalprior_pairs
contrasts(data_acq$TestSpeaker)
## [,1]
## S1 -0.5
## S4 0.5
contrasts(data_acq$Group)
## [,1]
## fam -0.5
## unfam 0.5
contrasts(data_acq$sleepState)
## [,1]
## asleep -0.5
## awake 0.5
We also scaled the continuous predictors.
data_acq$mumDist<- scale(data_acq$mumDist)
data_acq$age <- scale(data_acq$age)
data_acq$nrSpeakersDaily <- scale(data_acq$nrSpeakersDaily)
We run our model and have a look:
num_chains <- 4 # number of chains = number of processor cores
num_iter <- 80000 # number of samples per chain: we use the Savage-Dickey method for hypothesis testing, which requires a LOT of samples
num_warmup <- num_iter / 2 # number of warm-up samples per chain
num_thin <- 1 # thinning: extract one out of x samples per chain
model_acq = brm(MMR ~ 1 + TestSpeaker * Group +
mumDist +
nrSpeakersDaily +
sleepState +
age +
(1 + TestSpeaker | Subj),
data = data_acq,
prior = priors,
family = gaussian(),
control = list(
adapt_delta = .99,
max_treedepth = 15
),
iter = num_iter,
chains = num_chains,
warmup = num_warmup,
thin = num_thin,
cores = num_chains,
seed = project_seed,
file = here("data", "model_output", "A2_model_acq.rds"),
file_refit = "on_change",
save_pars = save_pars(all = TRUE)
)
model_acq <- readRDS(here("data", "model_output", "A2_model_acq.rds"))
summary(model_acq)
## Warning: There were 85 divergent transitions after
## warmup. Increasing adapt_delta above 0.99 may help. See
## http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: MMR ~ 1 + TestSpeaker * Group + mumDist + nrSpeakersDaily + sleepState + age + (1 + TestSpeaker | Subj)
## Data: data_acq (Number of observations: 134)
## Draws: 4 chains, each with iter = 80000; warmup = 40000; thin = 1;
## total post-warmup draws = 160000
##
## Group-Level Effects:
## ~Subj (Number of levels: 71)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 2.33 1.15 0.18 4.52 1.00 3330
## sd(TestSpeaker1) 2.67 1.94 0.11 7.29 1.00 2905
## cor(Intercept,TestSpeaker1) 0.22 0.49 -0.86 0.96 1.00 83256
## Tail_ESS
## sd(Intercept) 1476
## sd(TestSpeaker1) 964
## cor(Intercept,TestSpeaker1) 78147
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## Intercept 1.37 1.26 -1.11 3.85 1.00 147304 114909
## TestSpeaker1 1.25 1.16 -1.02 3.54 1.00 146717 120928
## Group1 -1.52 1.13 -3.73 0.69 1.00 133969 111611
## mumDist -0.40 0.64 -1.65 0.85 1.00 117476 114518
## nrSpeakersDaily 0.41 0.57 -0.72 1.53 1.00 147764 114850
## sleepState1 -1.46 2.52 -6.41 3.47 1.00 142840 119363
## age 0.24 0.57 -0.88 1.36 1.00 143964 114495
## TestSpeaker1:Group1 -4.99 2.00 -8.90 -1.04 1.00 188757 114174
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 5.31 0.93 2.56 6.59 1.00 2291 798
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
There were 85 divergent transitions after warmup. However, the model seems to have converged, since the Rhat and ESS values are good enough (cf. https://mc-stan.org/rstan/reference/Rhat.html).
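These diagnostics can also be checked programmatically; a sketch using the `rhat()` and `neff_ratio()` methods that brms provides for fitted models (assuming `model_acq` is loaded):

```r
# Largest Rhat across all parameters: should be close to 1.00
max(rhat(model_acq), na.rm = TRUE)
# Smallest ratio of effective sample size to total post-warmup draws
min(neff_ratio(model_acq), na.rm = TRUE)
```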
We will plot the traces, to check model convergence:
plot(model_acq, ask = FALSE)
Here we see that the traces all look like nice hairy caterpillars,
meaning that the chains have mixed (e.g., https://nicholasrjenkins.science/post/bda_in_r/bayes_with_brms/).
The posterior distributions also look good.
Now we check whether the model is able to recover the underlying data. y is the observed data (the data that we put into the model), and yrep is data simulated from the posterior predictive distribution.
pp_check(model_acq, ndraws=50)
This looks good: the data overlap.
We now check the posterior samples of the posterior predictive distribution for the estimates of S1_fam, S4_fam, S1_unfam, and S4_unfam.
data_acq <-
data_acq %>%
unite( # unite columns for posterior predictive checks
# unites the two columns TestSpeaker and Group Because brms made a posterior distribution
# with TestSpeaker_Group, because it looks at an interaction.
"TestSpeaker_Group",
c("TestSpeaker", "Group"),
sep = "_",
remove = FALSE,
na.rm = FALSE
) %>%
select(Subj, TestSpeaker_Group, TestSpeaker, Group, MMR)
posterior_predict_model_acq <-
model_acq %>%
posterior_predict(ndraws = 2000)
PPS_acq <-
posterior_predict_model_acq %>%
ppc_stat_grouped(
y = pull(data_acq, MMR),
group = pull(data_acq, TestSpeaker_Group),
stat = "mean"
) +
ggtitle("Posterior predictive samples")
PPS_acq
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
y is the mean of the data that we put into the model; yrep are the posterior predictive samples from our posterior distribution. We see that our actual mean lies in the middle of the predicted distribution, and that the predicted distribution is normal, which is what we expect.
Now we want to test our hypotheses. Our main research question is: Is there a Voice Familiarity Benefit for phoneme acquisition in infants? We address three sub-questions:
Hypothesis 1 (directional): We expect that infants discriminate a vowel contrast better from the training voice when this speaker's voice was familiar during training, shown by a more negative mismatch response (MMR) in the FamTrain group than in the UnFamTrain group in the TrainVoiceTest (speaker 1).
Hypothesis 2 (non-directional): We expect a difference between the FamTrain group and the UnFamTrain group on the NovelVoiceTest (speaker 4).
Hypothesis 3 (directional): We expect that infants discriminate the vowel contrast better in the TrainVoiceTest than in the NovelVoiceTest, shown by a more negative MMR in the TrainVoiceTest (speaker 1) than in the NovelVoiceTest (speaker 4), irrespective of training group.
Hypothesis 4 (non-directional): We expect a relationship between the number of speakers infants are exposed to in daily life and vowel discrimination. This relationship could be either positive or negative (for a positive relation between the number of speakers and vowel discrimination, we would expect a negative relationship between the number of speakers and the MMR, since we assume that the more negative the MMR, the better the discrimination).
We thus want the following comparisons:
1. For TestSpeaker = 1, the MMR is more negative for Group = fam than for Group = unfam.
2. For TestSpeaker = 4, the MMR differs between Group = fam and Group = unfam.
3. For both groups together, the MMR is larger for TestSpeaker = 1 than for TestSpeaker = 4.
4. For both groups and speakers together, we expect a correlation between MMR and nrSpeakersDaily.
This means we want the following contrasts:
For Group:TestSpeaker:
RQ1: Groupfam TestSpeaker1 - Groupunfam TestSpeaker1
RQ2: Groupfam TestSpeaker4 - Groupunfam TestSpeaker4
For TestSpeaker:
RQ3: TestSpeaker1 - TestSpeaker4
General:
RQ4: nrSpeakersDaily
For our analyses we use Bayes Factors (BFs). However, as a robustness check, and to be able to compare our results with other experiments that use frequentist methods, we also compute the MAP-based p-value (p_MAP; https://easystats.github.io/bayestestR/reference/p_map.html). This is a Bayesian equivalent of the p-value, related to the odds that a parameter (described by its posterior distribution) has against the null hypothesis (H0) in Mills' (2014, 2017) Objective Bayesian Hypothesis Testing framework. It corresponds to the density value at 0 divided by the density at the Maximum A Posteriori (MAP). Caveats of the p_MAP: 1) it allows us to assess the presence of an effect, but not its magnitude or importance (https://easystats.github.io/bayestestR/articles/probability_of_direction.html); 2) it is sensitive only to the amount of evidence for the alternative hypothesis (i.e., when an effect is truly present) and cannot reflect the amount of evidence in favor of the null hypothesis: a high value suggests that the effect exists, but a low value indicates uncertainty regarding its existence, not certainty that it is non-existent (https://doi.org/10.3389/fpsyg.2019.02767).
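To illustrate this definition, the p_MAP can be approximated by hand from a vector of posterior draws; a toy sketch (not part of the analysis, using simulated draws):

```r
# Toy posterior: approximate p_MAP = density at 0 / density at the MAP
set.seed(1)
draws <- rnorm(1e5, mean = 1.5, sd = 1)  # simulated posterior draws
d <- density(draws)
p_map_manual <- approx(d$x, d$y, xout = 0)$y / max(d$y)
p_map_manual
# bayestestR::p_map(draws) uses a comparable (but more refined) density estimate
```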
Bayes Factors (https://doi.org/10.3389/fpsyg.2019.02767): the Bayes factor (BF) indicates the degree to which the mass of the posterior distribution has shifted away from or towards the null value (0) relative to the prior distribution, thus indicating whether the null hypothesis has become less or more likely given the observed data. We compute the BF as a Savage-Dickey density ratio, which approximates a Bayes factor comparing the marginal likelihood of the model against that of a model in which the tested parameter is restricted to the point null (Wagenmakers et al., 2010).
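The Savage-Dickey ratio itself is simply the posterior density at the point null divided by the prior density at that point; a toy sketch with simulated draws (not part of the analysis):

```r
# Savage-Dickey density ratio: BF01 = posterior density at 0 / prior density at 0
set.seed(1)
prior_draws     <- rnorm(1e5, 0, 10)  # toy prior on the effect
posterior_draws <- rnorm(1e5, 2, 1)   # toy posterior after seeing data
dens_at_zero <- function(x) { d <- density(x); approx(d$x, d$y, xout = 0)$y }
BF01 <- dens_at_zero(posterior_draws) / dens_at_zero(prior_draws)  # evidence for H0
BF10 <- 1 / BF01                                                   # evidence for H1
```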
We used the following code to compute the pMAP:
## Effect all ------------------------
pMAP_all_acq =
model_acq %>%
p_map() %>%
mutate(
"p < .05" = ifelse(p_MAP < .05, "*", "")
)
## Mean MMR per subset --------------------------
pMAP_subsetMMR_acq <- model_acq %>%
emmeans(~ Group * TestSpeaker) %>%
p_map() %>%
mutate("p < .05" = ifelse(p_MAP < .05, "*", ""))
## Simple Effect Group ------------------------
pMAP_Group_simple_acq =
model_acq %>%
emmeans(~ Group|TestSpeaker) %>%
pairs() %>%
p_map() %>%
mutate(
"p < .05" = ifelse(p_MAP < .05, "*", "")
)
pMAP_Group_simple_acq
## Effect Speaker ------------------------
pMAP_Speaker_acq =
model_acq %>%
emmeans(~ TestSpeaker) %>%
pairs() %>%
p_map() %>%
mutate(
"p < .05" = ifelse(p_MAP < .05, "*", "")
)
pMAP_Speaker_acq
We used the following code to compute the Bayes Factors:
model_acq_prior <- unupdate(model_acq) # sample priors from model
## Effect all ------------------------
# Bayes Factors (Savage-Dickey density ratio)
BF_all_acq <-
model_acq %>%
bf_parameters(prior = model_acq_prior) %>%
arrange(log_BF) # sort according to BF
# add rule-of-thumb interpretation
BF_all_acq <-
BF_all_acq %>%
add_column(
"interpretation" = interpret_bf(
BF_all_acq$log_BF,
rules = "raftery1995",
log = TRUE,
include_value = TRUE,
protect_ratio = TRUE,
exact = TRUE
),
.after = "log_BF"
)
## Mean MMR per subset ------------------
subsetMMR_prior_acq <- model_acq_prior %>%
emmeans(~ Group * TestSpeaker)
subsetMMR_acq <- model_acq %>%
emmeans(~ Group * TestSpeaker)
BF_subsetMMR_acq <- subsetMMR_acq %>%
bf_parameters(prior = subsetMMR_prior_acq) %>%
arrange(log_BF) %>%
add_column("interpretation" = interpret_bf(
.$log_BF,
rules = "raftery1995",
log = TRUE,
include_value = TRUE,
protect_ratio = TRUE,
exact = TRUE
), .after = "log_BF")
## Simple Effect Group ------------------------
# pairwise comparisons of prior distributions
Group_simple_pairwise_prior_acq =
model_acq_prior %>%
emmeans(~ Group|TestSpeaker) %>% # estimated marginal means
pairs() # pairwise comparisons
# pairwise comparisons of posterior distributions
Group_simple_pairwise_acq <-
model_acq %>%
emmeans(~ Group|TestSpeaker) %>%
pairs()
# Bayes Factors (Savage-Dickey density ratio)
BF_Group_simple_acq <-
Group_simple_pairwise_acq %>%
bf_parameters(prior = Group_simple_pairwise_prior_acq) %>%
arrange(log_BF) # sort according to BF
# add rule-of-thumb interpretation
BF_Group_simple_acq <-
BF_Group_simple_acq %>%
add_column(
"interpretation" = interpret_bf(
BF_Group_simple_acq$log_BF,
rules = "raftery1995",
log = TRUE,
include_value = TRUE,
protect_ratio = TRUE,
exact = TRUE
),
.after = "log_BF"
)
BF_Group_simple_acq
## Effect Speaker ------------------------
# pairwise comparisons of prior distributions
Speaker_pairwise_prior_acq <-
model_acq_prior %>%
emmeans(~ TestSpeaker) %>% # estimated marginal means
pairs() # pairwise comparisons
# pairwise comparisons of posterior distributions
Speaker_pairwise_acq <-
model_acq %>%
emmeans(~ TestSpeaker) %>%
pairs()
# Bayes Factors (Savage-Dickey density ratio)
BF_Speaker_acq <-
Speaker_pairwise_acq %>%
bf_parameters(prior = Speaker_pairwise_prior_acq) %>%
arrange(log_BF) # sort according to BF
# add rule-of-thumb interpretation
BF_Speaker_acq <-
BF_Speaker_acq %>%
add_column(
"interpretation" = interpret_bf(
BF_Speaker_acq$log_BF,
rules = "raftery1995",
log = TRUE,
include_value = TRUE,
protect_ratio = TRUE,
exact = TRUE
),
.after = "log_BF"
)
BF_Speaker_acq
# Merge and save results ---------------------------------------------------------
pMAP_all_acq <- subset(pMAP_all_acq, select = -c(Effects, Component))
BF_all_acq <- subset(BF_all_acq, select = -c(Effects, Component))
pMAP_BF_all <- full_join(pMAP_all_acq, BF_all_acq, by = "Parameter") %>%
as_tibble()
pMAP_BF_subsetMMR <- full_join(pMAP_subsetMMR_acq, BF_subsetMMR_acq, by = "Parameter") %>%
as_tibble()
pMAP_BF_Group <- full_join(pMAP_Group_simple_acq, BF_Group_simple_acq, by = "Parameter") %>%
as_tibble()
pMAP_BF_Speaker <- full_join(pMAP_Speaker_acq, BF_Speaker_acq, by = "Parameter") %>%
as_tibble()
pMAP_BF_all <- rbind(pMAP_BF_all, pMAP_BF_subsetMMR, pMAP_BF_Group, pMAP_BF_Speaker)
# Save the results
saveRDS(pMAP_BF_all, file = here("data", "hypothesis_testing", "pMAP_BF_acq.rds"))
Let’s have a look at the output:
pMAP_BF_acq <- readRDS(here("data", "hypothesis_testing", "pMAP_BF_acq.rds"))
pMAP_BF_acq
## # A tibble: 15 × 5
## Parameter p_MAP `p < .05` log_BF interpretation
## <chr> <dbl> <chr> <dbl> <chr>
## 1 b_Intercept 0.538 "" -2.20 positive evidence (BF = 1/9.06…
## 2 b_TestSpeaker1 0.566 "" -1.59 positive evidence (BF = 1/4.89…
## 3 b_Group1 0.397 "" -1.25 positive evidence (BF = 1/3.51…
## 4 b_mumDist 0.812 "" -2.55 positive evidence (BF = 1/12.8…
## 5 b_nrSpeakersDaily 0.757 "" -2.61 positive evidence (BF = 1/13.5…
## 6 b_sleepState1 0.852 "" -1.21 positive evidence (BF = 1/3.35…
## 7 b_age 0.923 "" -2.78 positive evidence (BF = 1/16.1…
## 8 b_TestSpeaker1:Group1 0.0470 "*" 1.46 positive evidence (BF = 4.31) …
## 9 fam, S1 0.985 "" -2.61 positive evidence (BF = 1/13.6…
## 10 unfam, S1 0.754 "" -2.31 positive evidence (BF = 1/10.0…
## 11 fam, S4 0.0254 "*" 1.01 weak evidence (BF = 2.75) in f…
## 12 unfam, S4 1.00 "" -2.66 positive evidence (BF = 1/14.2…
## 13 fam - unfam, S1 0.805 "" -1.80 positive evidence (BF = 1/6.06…
## 14 fam - unfam, S4 0.0332 "*" 1.41 positive evidence (BF = 4.09) …
## 15 S1 - S4 0.566 "" -1.60 positive evidence (BF = 1/4.93…
Regarding our hypotheses, we find an effect of Group for S4.
We check whether the effects of Group and TestSpeaker are driven by a certain level of sleepState; this does not seem to be the case.
emmip(model_acq, TestSpeaker ~ Group | sleepState)
emmip(model_acq, Group ~ TestSpeaker | sleepState)
As a last step, we perform a sensitivity analysis for the BF of the significant effect of Group for S4, to check to what extent our BFs depend on the effect size assumed in our prior.
This is the code we used:
num_chains <- 4
num_iter <- 80000
num_warmup <- num_iter / 2
num_thin <- 1
# Run the model with different sds
prior_sd <- c(1, 5, 8, 10, 15, 20, 30, 40, 50)
#Run the models
for (i in 1:length(prior_sd)) {
psd <- prior_sd[i]
fit <- brm(MMR ~ 1 + TestSpeaker * Group +
mumDist +
nrSpeakersDaily +
sleepState +
age +
(1 + TestSpeaker | Subj),
data = data_acq,
prior = c(
set_prior("normal(3.5, 20)", class = "Intercept"),
set_prior(paste0("normal(0,", psd, ")"), class = "b"),
set_prior("normal(0, 20)", class = "sigma")
),
family = gaussian(),
control = list(
adapt_delta = .99,
max_treedepth = 15
),
iter = num_iter,
chains = num_chains,
warmup = num_warmup,
thin = num_thin,
cores = num_chains,
seed = project_seed,
save_pars = save_pars(all = TRUE),
file=here("data", "sensitivity_analysis", paste0("A5_sensAnal_BF_priorsd_", psd, ".rds"))
)
}
## Simple effect Group for speaker 4 -------------------------
BF <- c()
for (i in 1:length(prior_sd)) {
psd <- prior_sd[i]
fit <- readRDS(here("data", "sensitivity_analysis", paste0("A5_sensAnal_BF_priorsd_", psd, ".rds")))
fit_priors <- unupdate(fit)
m_prior <- fit_priors %>%
emmeans(~ Group | TestSpeaker) %>%
pairs()
m_post <- fit %>%
emmeans(~ Group | TestSpeaker) %>%
pairs()
BF_current <- bf_parameters(m_post, prior = m_prior) %>%
filter(Parameter == "fam - unfam, S4")
BF_current <- as.numeric(BF_current)
BF <- c(BF, BF_current)
}
res <- data.frame(prior_sd, BF, logBF = log10(BF))
save(res, file = here("data", "sensitivity_analysis", "A5_sensAnal_BF_simpleGroup-S4.rda"))
Let’s have a look at the plot:
load(here("data", "sensitivity_analysis","A5_sensAnal_BF_simpleGroup-S4.rda"))
res_groupS2_acq = res
prior_sd <- c(1, 5, 8, 10, 15, 20, 30, 40, 50)
breaks <- c(1 / 100, 1 / 50, 1 / 20, 1 / 10,1 / 5, 1 / 3,1, 3, 5, 10, 20, 50, 100)
p = ggplot(res, aes(x = prior_sd, y = BF)) +
geom_point(size = 2) +
geom_line() +
geom_hline(yintercept = 1, linetype = "dashed") +
scale_x_continuous("Normal prior width (SD)\n") +
scale_y_log10(expression("BF"[10]), breaks = breaks, labels = MASS::fractions(breaks)) +
coord_cartesian(ylim = c(1 / 100, 100), xlim = c(0, tail(prior_sd, n = 1))) +
annotate("text", x = 40, y = 30, label = expression("Evidence in favor of H"[1]), size = 5) +
annotate("text", x = 40, y = 1 / 30, label = expression("Evidence in favor of H"[0]), size = 5) +
theme(axis.text.y = element_text(size = 8)) +
theme_apa()
p
## Warning in is.na(x): is.na() applied to non-(list or vector) of type
## 'expression'
## Warning in is.na(x): is.na() applied to non-(list or vector) of type
## 'expression'
If we assume a very small effect size (a narrow prior with a low SD), we are essentially saying that we expect the true effect size to be close to zero; if the observed data show an effect consistent with this narrow prior, we will likely find support for the alternative hypothesis (H1). If, on the other hand, we assume a very large effect size (a broad prior with a high SD), we indicate that we expect a wide range of possible effect sizes, including very large ones; if the observed data do not show an effect as large as the broad prior suggests, the Bayes Factor may favor the null hypothesis (H0), as the data do not provide strong evidence for such a large effect. This plot shows exactly that: the higher the SD of the prior, the less evidence we find for our H1. However, the effect appears to be robust, since even with a very large prior SD (of 50), we still do not find evidence for the H0.
In this exploratory analysis, we test what is going on in the TW from 140 ms after stimulus onset (the offset of the vowel) until 334 ms (the start of our preset TW).
summary(data_acq_earlyTW)
## Subj MMR Group TestSpeaker VoiceTrain
## 1 : 2 Min. :-26.9505 fam :69 S1:65 Length:134
## 2 : 2 1st Qu.: -3.9642 unfam:65 S4:69 Class :character
## 3 : 2 Median : -0.5216 Mode :character
## 4 : 2 Mean : -0.8815
## 5 : 2 3rd Qu.: 2.4613
## 7 : 2 Max. : 19.4828
## (Other):122
## TestOrder age sex nrSpeakersDaily
## Min. :1143 Min. :-2.2546 Length:134 Min. :-0.56285
## 1st Qu.:1143 1st Qu.:-0.5056 Class :character 1st Qu.:-0.44973
## Median :1413 Median : 0.1941 Mode :character Median :-0.21480
## Mean :1769 Mean : 0.0000 Mean : 0.00000
## 3rd Qu.:2243 3rd Qu.: 0.7479 3rd Qu.: 0.06363
## Max. :2423 Max. : 1.4767 Max. : 6.32845
##
## sleepState BlocksAsleep Lab daysVoicetraining
## asleep: 7 Length:134 Length:134 Min. :15.00
## awake :127 Class :character Class :character 1st Qu.:18.00
## Mode :character Mode :character Median :20.00
## Mean :20.37
## 3rd Qu.:21.00
## Max. :28.00
## NA's :2
## Anmerkungen CommentsProtokoll mumDist TrialsStan
## Length:134 Length:134 Min. :-1.4128 Min. :16.00
## Class :character Class :character 1st Qu.:-0.7368 1st Qu.:38.00
## Mode :character Mode :character Median :-0.3376 Median :50.00
## Mean : 0.0000 Mean :47.11
## 3rd Qu.: 0.6807 3rd Qu.:56.00
## Max. : 3.2713 Max. :70.00
##
## TrialsDev
## Min. :11.00
## 1st Qu.:19.00
## Median :24.50
## Mean :23.56
## 3rd Qu.:28.00
## Max. :39.00
##
Here, we check what happens if we run the model based on our priors only.
num_chains <- 4
num_iter <- 4000
num_warmup <- num_iter / 2
num_thin <- 1
priorpredcheck_acq_earlyTW <- brm(MMR ~ 1 + TestSpeaker * Group +
mumDist +
nrSpeakersDaily +
sleepState +
age +
(1 + TestSpeaker | Subj),
data = data_acq_earlyTW,
prior = priors,
family = gaussian(),
control = list(
adapt_delta = .99,
max_treedepth = 15
),
iter = num_iter,
chains = num_chains,
warmup = num_warmup,
thin = num_thin,
cores = num_chains,
seed = project_seed,
file = here("data", "model_output", "A0_model_priorpredcheck_acq_earlyTW.rds"),
file_refit = "on_change",
save_pars = save_pars(all = TRUE),
sample_prior = "only"
)
priorpredcheck_acq_earlyTW <- readRDS(here("data", "model_output", "A0_model_priorpredcheck_acq_earlyTW.rds"))
pp <- posterior_predict(priorpredcheck_acq_earlyTW)
pp <- t(pp)
# distribution of mean MMR
meanMMR = colMeans(pp)
hist(meanMMR, breaks = 40)
summary(priorpredcheck_acq_earlyTW)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: MMR ~ 1 + TestSpeaker * Group + mumDist + nrSpeakersDaily + sleepState + age + (1 + TestSpeaker | Subj)
## Data: data_acq_earlyTW (Number of observations: 134)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Group-Level Effects:
## ~Subj (Number of levels: 71)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 5.40 6.03 0.21 20.66 1.00 9824
## sd(TestSpeakerS4) 5.54 6.48 0.17 21.38 1.00 10103
## cor(Intercept,TestSpeakerS4) -0.00 0.58 -0.95 0.95 1.00 14169
## Tail_ESS
## sd(Intercept) 4055
## sd(TestSpeakerS4) 4251
## cor(Intercept,TestSpeakerS4) 4608
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept 3.05 23.40 -43.08 49.45 1.00 14922
## TestSpeakerS4 0.19 10.13 -19.70 20.21 1.00 15734
## Groupunfam 0.01 10.07 -19.90 20.11 1.00 15230
## mumDist 0.08 9.91 -19.29 20.07 1.00 13780
## nrSpeakersDaily 0.18 10.08 -19.70 20.14 1.00 16390
## sleepStateawake 0.13 10.22 -19.89 20.66 1.00 16016
## age -0.08 10.10 -19.64 19.38 1.00 15773
## TestSpeakerS4:Groupunfam 0.08 9.97 -19.45 19.95 1.00 15338
## Tail_ESS
## Intercept 5552
## TestSpeakerS4 5920
## Groupunfam 5828
## mumDist 5690
## nrSpeakersDaily 5444
## sleepStateawake 5602
## age 6118
## TestSpeakerS4:Groupunfam 5306
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 15.97 12.00 0.65 44.28 1.00 5986 3819
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
For the intercept prior, we specified a mean of 3.5 and an SD of 20. In these
plots and the model summary, we see that the mean of the MMR is
estimated at 3.05 and is approximately normally distributed, with an estimated error
of 23.40, in line with the prior. Sigma was specified as normal(0, 20); the
estimate of 15.97 with an estimated error of 12.00 seems reasonable. The
priors on the slopes were specified as normal(0, 10), and the slope
estimates also look reasonable.
Conclusion: our priors seem to be reasonable, also for this time window.
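The logic of this check can be sketched in base R without brms: draw parameters from the priors stated above (intercept normal(3.5, 20), slopes normal(0, 10), sigma half-normal(0, 20)), simulate a data set per draw, and look at the distribution of the mean MMR. The single standardized predictor below is a stand-in for illustration, not our actual design matrix.

```r
# Minimal prior predictive sketch (illustrative, not the brms draws)
set.seed(1)
n_obs   <- 134                   # matches the number of observations
n_draws <- 4000
mean_mmr <- replicate(n_draws, {
  b0    <- rnorm(1, 3.5, 20)     # intercept prior
  b1    <- rnorm(1, 0, 10)       # one illustrative slope prior
  x     <- rnorm(n_obs)          # a standardized stand-in predictor
  sigma <- abs(rnorm(1, 0, 20))  # half-normal residual SD
  mean(rnorm(n_obs, b0 + b1 * x, sigma))
})
hist(mean_mmr, breaks = 40)      # centered near 3.5 and very wide, as in the plot above
```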
As above, we set the contrasts and scaled the predictors:
contrasts(data_acq_earlyTW$TestSpeaker) <- contr.equalprior_pairs
contrasts(data_acq_earlyTW$Group) <- contr.equalprior_pairs
contrasts(data_acq_earlyTW$sleepState) <- contr.equalprior_pairs
data_acq_earlyTW$mumDist<- scale(data_acq_earlyTW$mumDist)
data_acq_earlyTW$age <- scale(data_acq_earlyTW$age)
data_acq_earlyTW$nrSpeakersDaily <- scale(data_acq_earlyTW$nrSpeakersDaily)
We run our model and have a look:
num_chains <- 4 # number of chains = number of processor cores
num_iter <- 80000 # number of samples per chain: we use the Savage-Dickey method for hypothesis testing, which requires a large number of posterior samples
num_warmup <- num_iter / 2 # number of warm-up samples per chain
num_thin <- 1 # thinning: extract one out of x samples per chain
model_acq_earlyTW = brm(MMR ~ 1 + TestSpeaker * Group +
mumDist +
nrSpeakersDaily +
sleepState +
age +
(1 + TestSpeaker | Subj),
data = data_acq_earlyTW,
prior = priors,
family = gaussian(),
control = list(
adapt_delta = .99,
max_treedepth = 15
),
iter = num_iter,
chains = num_chains,
warmup = num_warmup,
thin = num_thin,
cores = num_chains,
seed = project_seed,
file = "A2_model_acq_earlyTW.rds",
file_refit = "on_change",
save_pars = save_pars(all = TRUE)
)
model_acq_earlyTW <- readRDS(here("data", "model_output", "E2_model_acq_earlyTW.rds"))
summary(model_acq_earlyTW)
## Family: gaussian
## Links: mu = identity; sigma = identity
## Formula: MMR ~ 1 + TestSpeaker * Group + mumDist + nrSpeakersDaily + sleepState + age + (1 + TestSpeaker | Subj)
## Data: data_acq_earlyTW (Number of observations: 134)
## Draws: 4 chains, each with iter = 4000; warmup = 2000; thin = 1;
## total post-warmup draws = 8000
##
## Group-Level Effects:
## ~Subj (Number of levels: 71)
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## sd(Intercept) 2.17 1.44 0.09 5.34 1.00 727
## sd(TestSpeakerS4) 2.11 1.68 0.06 6.38 1.00 863
## cor(Intercept,TestSpeakerS4) -0.37 0.55 -0.98 0.87 1.00 3268
## Tail_ESS
## sd(Intercept) 935
## sd(TestSpeakerS4) 790
## cor(Intercept,TestSpeakerS4) 4221
##
## Population-Level Effects:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS
## Intercept 1.64 2.75 -3.71 6.97 1.00 7818
## TestSpeakerS4 3.15 1.57 0.07 6.29 1.00 5838
## Groupunfam 0.45 1.58 -2.64 3.50 1.00 6366
## mumDist -0.34 0.66 -1.64 0.95 1.00 8507
## nrSpeakersDaily 0.87 0.57 -0.24 1.99 1.00 9522
## sleepStateawake -3.60 2.59 -8.66 1.54 1.00 8790
## age -0.13 0.57 -1.26 0.99 1.00 9521
## TestSpeakerS4:Groupunfam -3.80 2.10 -7.87 0.31 1.00 5686
## Tail_ESS
## Intercept 5742
## TestSpeakerS4 5984
## Groupunfam 6082
## mumDist 5897
## nrSpeakersDaily 6151
## sleepStateawake 6178
## age 6160
## TestSpeakerS4:Groupunfam 5966
##
## Family Specific Parameters:
## Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
## sigma 5.89 0.62 4.38 6.93 1.00 930 633
##
## Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
## and Tail_ESS are effective sample size measures, and Rhat is the potential
## scale reduction factor on split chains (at convergence, Rhat = 1).
There were 1775 divergent transitions after warmup. However, the model seems to have converged, since the Rhat and ESS values are good enough (cf. https://mc-stan.org/rstan/reference/Rhat.html).
We will plot the traces, to check model convergence:
plot(model_acq_earlyTW, ask = FALSE)
Here we see that the traces all look like nice hairy caterpillars,
meaning that the chains have mixed (e.g., https://nicholasrjenkins.science/post/bda_in_r/bayes_with_brms/).
The posterior distributions also look good.
Now we check whether the model is able to retrieve the underlying data: y is the observed data, i.e., the data that we put into the model, and y_rep is data simulated from the posterior predictive distribution.
pp_check(model_acq_earlyTW, ndraws=50)
This looks good: the data overlap.
We now check the posterior samples of the posterior predictive distribution, for the estimates of S1_fam, S4_fam, S1_unfam, and S4_unfam.
data_acq_earlyTW <-
data_acq_earlyTW %>%
unite( # unite the TestSpeaker and Group columns for the posterior predictive checks:
# brms estimates the TestSpeaker-by-Group interaction, so its posterior
# distributions are organised by the combined TestSpeaker_Group cells.
"TestSpeaker_Group",
c("TestSpeaker", "Group"),
sep = "_",
remove = FALSE,
na.rm = FALSE
) %>%
select(Subj, TestSpeaker_Group, TestSpeaker, Group, MMR)
posterior_predict_model_acq_earlyTW <-
model_acq_earlyTW %>%
posterior_predict(ndraws = 2000)
PPS_acq_earlyTW <-
posterior_predict_model_acq_earlyTW %>%
ppc_stat_grouped(
y = pull(data_acq_earlyTW, MMR),
group = pull(data_acq_earlyTW, TestSpeaker_Group),
stat = "mean"
) +
ggtitle("Posterior predictive samples")
PPS_acq_earlyTW
## `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
y is the mean of the observed data that we put into the model; y_rep shows the
corresponding means of the posterior predictive samples. We see that
each observed mean lies in the middle of its predicted distribution, and that
the predicted distributions are approximately normal, which is what we
expect.
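For transparency, here is what ppc_stat_grouped() computes, sketched by hand on toy data (all numbers below are made up for illustration; the real check uses the brms draws and our four TestSpeaker-by-Group cells):

```r
# By-hand version of ppc_stat_grouped(..., stat = "mean") on toy data
set.seed(2)
y     <- rnorm(134, mean = 1, sd = 6)     # toy "observed" MMR values
group <- sample(c("S1_fam", "S1_unfam", "S4_fam", "S4_unfam"),
                134, replace = TRUE)
yrep  <- matrix(rnorm(2000 * 134, mean = 1, sd = 6),
                nrow = 2000)              # toy posterior predictive draws
# per-draw cell means: a 2000 x 4 matrix, one column per cell
cell_means <- sapply(split(seq_along(y), group),
                     function(idx) rowMeans(yrep[, idx, drop = FALSE]))
# the observed cell means should sit inside these distributions
obs_means <- tapply(y, group, mean)
```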
In this exploratory test, we are interested in the effect of Group for the Novel Test Speaker (S4).
As above, we used the following code to compute the pMAP:
pMAP_Group_simple_acq_earlyTW <- model_acq_earlyTW %>%
emmeans(~ Group | TestSpeaker) %>%
pairs() %>%
p_map() %>%
mutate("p < .05" = ifelse(p_MAP < .05, "*", ""))
As above, we used the following code to compute the Bayes Factors:
model_acq_earlyTW_prior <- unupdate(model_acq_earlyTW)
Group_simple_pairwise_prior_acq_earlyTW <- model_acq_earlyTW_prior %>%
emmeans(~ Group | TestSpeaker) %>%
pairs()
Group_simple_pairwise_acq_earlyTW <- model_acq_earlyTW %>%
emmeans(~ Group | TestSpeaker) %>%
pairs()
BF_Group_simple_acq_earlyTW <- Group_simple_pairwise_acq_earlyTW %>%
bf_parameters(prior = Group_simple_pairwise_prior_acq_earlyTW) %>%
arrange(log_BF) %>%
add_column("interpretation" = interpret_bf(
.$log_BF,
rules = "raftery1995",
log = TRUE,
include_value = TRUE,
protect_ratio = TRUE,
exact = TRUE
), .after = "log_BF")
# Merge and save results ---------------------------------------------------------
pMAP_BF_Group_earlyTW <- full_join(pMAP_Group_simple_acq_earlyTW, BF_Group_simple_acq_earlyTW, by = "Parameter") %>%
as_tibble()
saveRDS(pMAP_BF_Group_earlyTW, file = here("data", "hypothesis_testing", "pMAP_BF_acq_earlyTW.rds"))
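bf_parameters() with a prior model implements the Savage-Dickey density ratio: BF10 is the prior density at the null value divided by the posterior density at that value. In miniature, with toy draws (the normal(0, 10) prior matches our slope prior; the posterior here is made up):

```r
# Savage-Dickey density ratio on toy draws
set.seed(4)
prior_draws <- rnorm(1e5, 0, 10)   # prior on the contrast
post_draws  <- rnorm(1e5, 3, 1)    # toy posterior, shifted away from 0
dens_at_0 <- function(x) {         # kernel density estimate at the null value
  d <- density(x)
  approx(d$x, d$y, xout = 0)$y
}
BF10 <- dens_at_0(prior_draws) / dens_at_0(post_draws)
log(BF10)                          # positive: the data have moved mass away from 0
```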
pMAP_BF_Group_earlyTW <- readRDS(here("data", "hypothesis_testing", "pMAP_BF_acq_earlyTW.rds"))
pMAP_BF_Group_earlyTW
## # A tibble: 2 × 5
## Parameter p_MAP `p < .05` log_BF interpretation
## <chr> <dbl> <chr> <dbl> <effctsz_>
## 1 fam - unfam, S1 0.940 "" -1.77 positive evidence (BF = 1/5.87) again…
## 2 fam - unfam, S4 0.0840 "" 0.311 weak evidence (BF = 1.36) in favour of
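The interpretation labels in the table follow the "raftery1995" rules used by interpret_bf(). As a plain-R sketch of those thresholds (per Raftery, 1995: BF 1-3 weak, 3-20 positive, 20-150 strong, above 150 very strong, with BFs below 1 mirrored as evidence against):

```r
# Sketch of the Raftery (1995) interpretation rules for Bayes factors
interpret_bf_sketch <- function(bf) {
  side <- if (bf >= 1) "in favour of" else "against"
  b <- max(bf, 1 / bf)                  # strength is symmetric around BF = 1
  labels <- c("weak", "positive", "strong", "very strong")
  strength <- labels[findInterval(b, c(3, 20, 150)) + 1]
  paste(strength, "evidence", side)
}
interpret_bf_sketch(1.36)      # "weak evidence in favour of"
interpret_bf_sketch(1 / 5.87)  # "positive evidence against"
```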
As a last step, we will perform a sensitivity analysis for the BFs for the effect of Group for S4, to check to what extent our BFs depend on the effect size assumed in our prior.
This is the code we used:
num_chains <- 4
num_iter <- 80000
num_warmup <- num_iter / 2
num_thin <- 1
# Run the models with different prior SDs on the population-level effects
prior_sd <- c(1, 5, 8, 10, 15, 20, 30, 40, 50)
for (i in 1:length(prior_sd)) {
psd <- prior_sd[i]
fit <- brm(MMR ~ 1 + TestSpeaker * Group +
mumDist +
nrSpeakersDaily +
sleepState +
age +
(1 + TestSpeaker | Subj),
data = data_acq_earlyTW,
prior = c(
set_prior("normal(3.5, 20)", class = "Intercept"),
set_prior(paste0("normal(0,", psd, ")"), class = "b"),
set_prior("normal(0, 20)", class = "sigma")
),
family = gaussian(),
control = list(
adapt_delta = .99,
max_treedepth = 15
),
iter = num_iter,
chains = num_chains,
warmup = num_warmup,
thin = num_thin,
cores = num_chains,
seed = project_seed,
save_pars = save_pars(all = TRUE),
file=here("data", "sensitivity_analysis", paste0("E5_earlyTW_sensAnal_BF_priorsd_", psd, ".rds"))
)
}
## Simple effect Group for speaker 4 -------------------------
BF <- c()
for (i in 1:length(prior_sd)) {
psd <- prior_sd[i]
fit <- readRDS(here("data", "sensitivity_analysis", paste0("E5_earlyTW_sensAnal_BF_priorsd_", psd, ".rds")))
fit_priors <- unupdate(fit)
m_prior <- fit_priors %>%
emmeans(~ Group | TestSpeaker) %>%
pairs()
m_post <- fit %>%
emmeans(~ Group | TestSpeaker) %>%
pairs()
BF_current <- bf_parameters(m_post, prior = m_prior) %>%
filter(Parameter == "fam - unfam, S4")
BF_current <- exp(BF_current$log_BF) # bf_parameters() returns the log BF
BF <- c(BF, BF_current)
}
res <- data.frame(prior_sd, BF, logBF = log10(BF))
save(res, file = here("data", "sensitivity_analysis", "E5_earlyTW_sensAnal_BF_simpleGroup-S4.rda"))
Let’s have a look at the plot:
load(here("data", "sensitivity_analysis","E5_earlyTW_sensAnal_BF_simpleGroup-S4.rda"))
res_groupS4_acq <- res
prior_sd <- c(1, 5, 8, 10, 15, 20, 30, 40, 50)
breaks <- c(1 / 100, 1 / 50, 1 / 20, 1 / 10,1 / 5, 1 / 3,1, 3, 5, 10, 20, 50, 100)
p = ggplot(res, aes(x = prior_sd, y = BF)) +
geom_point(size = 2) +
geom_line() +
geom_hline(yintercept = 1, linetype = "dashed") +
scale_x_continuous("Normal prior width (SD)\n") +
scale_y_log10(expression("BF"[10]), breaks = breaks, labels = MASS::fractions(breaks)) +
coord_cartesian(ylim = c(1 / 100, 100), xlim = c(0, tail(prior_sd, n = 1))) +
annotate("text", x = 40, y = 30, label = expression("Evidence in favor of H"[1]), size = 5) +
annotate("text", x = 40, y = 1 / 30, label = expression("Evidence in favor of H"[0]), size = 5) +
theme(axis.text.y = element_text(size = 8)) +
theme_apa()
p
## Warning in is.na(x): is.na() applied to non-(list or vector) of type
## 'expression'
## Warning in is.na(x): is.na() applied to non-(list or vector) of type
## 'expression'
As stated above, in a sensitivity analysis we expect support for H1 when we assume a very small effect size (low prior SD), but support for H0 when we assume a very large effect size (very high prior SD). In this exploratory analysis, we see that the BF is much more dependent on the prior on the effect size than the effect of S4 in the preregistered time window was.
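The shape of this dependence can be reproduced analytically in a conjugate normal-normal sketch (illustrative numbers, not our model or data): with an observed effect y, standard error se, and a normal(0, tau) prior, the posterior is normal, and the Savage-Dickey BF10 is the prior density at 0 over the posterior density at 0.

```r
# Analytic Savage-Dickey BF in a conjugate normal-normal model (toy numbers)
bf10_normal <- function(tau, y = 3, se = 1.5) {
  post_var  <- 1 / (1 / tau^2 + 1 / se^2)  # conjugate normal update
  post_mean <- post_var * y / se^2
  dnorm(0, 0, tau) / dnorm(0, post_mean, sqrt(post_var))
}
sapply(c(1, 5, 10, 20, 50), bf10_normal)
# BF10 rises and then falls as tau grows: very wide priors put so little
# mass near plausible effect sizes that H0 ends up favoured.
```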
sessioninfo::session_info()
## ─ Session info ───────────────────────────────────────────────────────────────
## setting value
## version R version 4.2.2 (2022-10-31)
## os macOS Big Sur ... 10.16
## system x86_64, darwin17.0
## ui X11
## language (EN)
## collate en_US.UTF-8
## ctype en_US.UTF-8
## tz Europe/Berlin
## date 2025-11-20
## pandoc 2.19.2 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
##
## ─ Packages ───────────────────────────────────────────────────────────────────
## package * version date (UTC) lib source
## abind 1.4-5 2016-07-21 [2] CRAN (R 4.2.0)
## askpass 1.1 2019-01-13 [2] CRAN (R 4.2.0)
## assertthat 0.2.1 2019-03-21 [2] CRAN (R 4.2.0)
## backports 1.4.1 2021-12-13 [2] CRAN (R 4.2.0)
## base64enc 0.1-3 2015-07-28 [2] CRAN (R 4.2.0)
## bayesplot * 1.10.0 2022-11-16 [2] CRAN (R 4.2.0)
## bayestestR * 0.13.1 2023-04-07 [2] CRAN (R 4.2.0)
## boot 1.3-28.1 2022-11-22 [2] CRAN (R 4.2.0)
## bridgesampling 1.1-2 2021-04-16 [2] CRAN (R 4.2.0)
## brms * 2.18.0 2022-09-19 [2] CRAN (R 4.2.0)
## Brobdingnag 1.2-9 2022-10-19 [2] CRAN (R 4.2.0)
## broom 1.0.1 2022-08-29 [2] CRAN (R 4.2.0)
## bslib 0.4.1 2022-11-02 [2] CRAN (R 4.2.0)
## cachem 1.0.6 2021-08-19 [2] CRAN (R 4.2.0)
## callr 3.7.3 2022-11-02 [2] CRAN (R 4.2.0)
## cellranger 1.1.0 2016-07-27 [2] CRAN (R 4.2.0)
## checkmate 2.1.0 2022-04-21 [2] CRAN (R 4.2.0)
## cli 3.6.2 2023-12-11 [2] CRAN (R 4.2.0)
## coda 0.19-4 2020-09-30 [2] CRAN (R 4.2.0)
## codetools 0.2-18 2020-11-04 [2] CRAN (R 4.2.2)
## colorspace 2.0-3 2022-02-21 [2] CRAN (R 4.2.0)
## colourpicker 1.2.0 2022-10-28 [2] CRAN (R 4.2.0)
## correlation * 0.8.4 2023-04-06 [2] CRAN (R 4.2.0)
## corrplot * 0.92 2021-11-18 [2] CRAN (R 4.2.0)
## crayon 1.5.2 2022-09-29 [2] CRAN (R 4.2.0)
## credentials 1.3.2 2021-11-29 [2] CRAN (R 4.2.0)
## crosstalk 1.2.0 2021-11-04 [2] CRAN (R 4.2.0)
## datawizard * 1.0.0 2025-01-10 [1] CRAN (R 4.2.2)
## DBI 1.1.3 2022-06-18 [2] CRAN (R 4.2.0)
## dbplyr 2.2.1 2022-06-27 [2] CRAN (R 4.2.0)
## digest 0.6.30 2022-10-18 [2] CRAN (R 4.2.0)
## distributional 0.3.1 2022-09-02 [2] CRAN (R 4.2.0)
## dplyr * 1.0.10 2022-09-01 [2] CRAN (R 4.2.0)
## DT 0.26 2022-10-19 [2] CRAN (R 4.2.0)
## dygraphs 1.1.1.6 2018-07-11 [2] CRAN (R 4.2.0)
## easystats * 0.7.0 2023-11-05 [2] CRAN (R 4.2.2)
## effectsize * 0.8.6 2023-09-14 [2] CRAN (R 4.2.0)
## ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.2.0)
## emmeans * 1.8.3 2022-12-06 [2] CRAN (R 4.2.0)
## estimability 1.4.1 2022-08-05 [2] CRAN (R 4.2.0)
## evaluate 0.18 2022-11-07 [2] CRAN (R 4.2.0)
## fansi 1.0.3 2022-03-24 [2] CRAN (R 4.2.0)
## farver 2.1.1 2022-07-06 [2] CRAN (R 4.2.0)
## fastmap 1.1.0 2021-01-25 [2] CRAN (R 4.2.0)
## forcats * 0.5.2 2022-08-19 [2] CRAN (R 4.2.0)
## fs 1.5.2 2021-12-08 [2] CRAN (R 4.2.0)
## gamm4 0.2-6 2020-04-03 [2] CRAN (R 4.2.0)
## gargle 1.2.1 2022-09-08 [2] CRAN (R 4.2.0)
## generics 0.1.3 2022-07-05 [2] CRAN (R 4.2.0)
## gert 1.9.2 2022-12-05 [2] CRAN (R 4.2.0)
## GGally 2.2.0 2023-11-22 [2] CRAN (R 4.2.0)
## ggmcmc * 1.5.1.1 2021-02-10 [2] CRAN (R 4.2.0)
## ggplot2 * 3.4.4 2023-10-12 [2] CRAN (R 4.2.0)
## ggstats 0.5.1 2023-11-21 [2] CRAN (R 4.2.0)
## gh 1.3.1 2022-09-08 [2] CRAN (R 4.2.0)
## glue 1.6.2 2022-02-24 [2] CRAN (R 4.2.0)
## googledrive 2.0.0 2021-07-08 [2] CRAN (R 4.2.0)
## googlesheets4 1.0.1 2022-08-13 [2] CRAN (R 4.2.0)
## gridExtra 2.3 2017-09-09 [2] CRAN (R 4.2.0)
## gtable 0.3.1 2022-09-01 [2] CRAN (R 4.2.0)
## gtools 3.9.4 2022-11-27 [2] CRAN (R 4.2.0)
## haven 2.5.1 2022-08-22 [2] CRAN (R 4.2.0)
## here * 1.0.1 2020-12-13 [2] CRAN (R 4.2.0)
## highr 0.11 2024-05-26 [1] CRAN (R 4.2.2)
## hms 1.1.2 2022-08-19 [2] CRAN (R 4.2.0)
## htmltools 0.5.4 2022-12-07 [2] CRAN (R 4.2.0)
## htmlwidgets 1.5.4 2021-09-08 [2] CRAN (R 4.2.0)
## httpuv 1.6.6 2022-09-08 [2] CRAN (R 4.2.0)
## httr 1.4.4 2022-08-17 [2] CRAN (R 4.2.0)
## igraph 1.3.5 2022-09-22 [2] CRAN (R 4.2.0)
## inline 0.3.19 2021-05-31 [2] CRAN (R 4.2.0)
## insight * 1.0.2 2025-02-06 [1] CRAN (R 4.2.2)
## jquerylib 0.1.4 2021-04-26 [2] CRAN (R 4.2.0)
## jsonlite 1.8.4 2022-12-06 [2] CRAN (R 4.2.0)
## knitr 1.41 2022-11-18 [2] CRAN (R 4.2.0)
## labeling 0.4.2 2020-10-20 [2] CRAN (R 4.2.0)
## later 1.3.0 2021-08-18 [2] CRAN (R 4.2.0)
## lattice 0.20-45 2021-09-22 [2] CRAN (R 4.2.2)
## lifecycle 1.0.3 2022-10-07 [2] CRAN (R 4.2.0)
## lme4 1.1-31 2022-11-01 [2] CRAN (R 4.2.0)
## logspline * 2.1.21 2023-10-26 [2] CRAN (R 4.2.0)
## loo 2.5.1 2022-03-24 [2] CRAN (R 4.2.0)
## lubridate 1.9.0 2022-11-06 [2] CRAN (R 4.2.0)
## magrittr 2.0.3 2022-03-30 [2] CRAN (R 4.2.0)
## markdown 1.4 2022-11-16 [2] CRAN (R 4.2.0)
## MASS 7.3-58.1 2022-08-03 [2] CRAN (R 4.2.2)
## Matrix 1.5-3 2022-11-11 [2] CRAN (R 4.2.0)
## matrixStats 0.63.0 2022-11-18 [2] CRAN (R 4.2.0)
## mgcv 1.8-41 2022-10-21 [2] CRAN (R 4.2.2)
## mime 0.12 2021-09-28 [2] CRAN (R 4.2.0)
## miniUI 0.1.1.1 2018-05-18 [2] CRAN (R 4.2.0)
## minqa 1.2.5 2022-10-19 [2] CRAN (R 4.2.0)
## modelbased * 0.8.6 2023-01-13 [2] CRAN (R 4.2.0)
## modelr 0.1.10 2022-11-11 [2] CRAN (R 4.2.0)
## multcomp 1.4-20 2022-08-07 [2] CRAN (R 4.2.0)
## munsell 0.5.0 2018-06-12 [2] CRAN (R 4.2.0)
## mvtnorm 1.1-3 2021-10-08 [2] CRAN (R 4.2.0)
## nlme 3.1-160 2022-10-10 [2] CRAN (R 4.2.2)
## nloptr 2.0.3 2022-05-26 [2] CRAN (R 4.2.0)
## openssl 2.0.5 2022-12-06 [2] CRAN (R 4.2.0)
## papaja * 0.1.1 2022-07-05 [2] CRAN (R 4.2.0)
## parameters * 0.21.4 2024-02-04 [2] CRAN (R 4.2.2)
## performance * 0.10.8 2023-10-30 [2] CRAN (R 4.2.0)
## pillar 1.8.1 2022-08-19 [2] CRAN (R 4.2.0)
## pkgbuild 1.4.0 2022-11-27 [2] CRAN (R 4.2.0)
## pkgconfig 2.0.3 2019-09-22 [2] CRAN (R 4.2.0)
## plyr 1.8.8 2022-11-11 [2] CRAN (R 4.2.0)
## posterior 1.3.1 2022-09-06 [2] CRAN (R 4.2.0)
## prereg 0.6.0 2022-01-20 [2] CRAN (R 4.2.0)
## prettyunits 1.1.1 2020-01-24 [2] CRAN (R 4.2.0)
## processx 3.8.0 2022-10-26 [2] CRAN (R 4.2.0)
## projpred 2.2.2 2022-11-09 [2] CRAN (R 4.2.0)
## promises 1.2.0.1 2021-02-11 [2] CRAN (R 4.2.0)
## ps 1.7.2 2022-10-26 [2] CRAN (R 4.2.0)
## purrr * 1.0.2 2023-08-10 [2] CRAN (R 4.2.0)
## R6 2.5.1 2021-08-19 [2] CRAN (R 4.2.0)
## ranger 0.14.1 2022-06-18 [2] CRAN (R 4.2.0)
## RColorBrewer * 1.1-3 2022-04-03 [2] CRAN (R 4.2.0)
## Rcpp * 1.0.9 2022-07-08 [2] CRAN (R 4.2.0)
## RcppParallel 5.1.5 2022-01-05 [2] CRAN (R 4.2.0)
## readr * 2.1.3 2022-10-01 [2] CRAN (R 4.2.0)
## readxl 1.4.1 2022-08-17 [2] CRAN (R 4.2.0)
## renv 0.16.0 2022-09-29 [2] CRAN (R 4.2.0)
## report * 0.5.8 2023-12-07 [2] CRAN (R 4.2.2)
## reprex 2.0.2 2022-08-17 [2] CRAN (R 4.2.0)
## reshape2 1.4.4 2020-04-09 [2] CRAN (R 4.2.0)
## rlang 1.1.2 2023-11-04 [2] CRAN (R 4.2.0)
## rmarkdown 2.18 2022-11-09 [2] CRAN (R 4.2.0)
## rprojroot 2.0.3 2022-04-02 [2] CRAN (R 4.2.0)
## rstan 2.21.7 2022-09-08 [2] CRAN (R 4.2.0)
## rstantools 2.2.0 2022-04-08 [2] CRAN (R 4.2.0)
## rstudioapi 0.14 2022-08-22 [2] CRAN (R 4.2.0)
## rticles 0.24 2022-08-25 [2] CRAN (R 4.2.0)
## rvest 1.0.3 2022-08-19 [2] CRAN (R 4.2.0)
## sandwich 3.0-2 2022-06-15 [2] CRAN (R 4.2.0)
## sass 0.4.4 2022-11-24 [2] CRAN (R 4.2.0)
## scales 1.2.1 2022-08-20 [2] CRAN (R 4.2.0)
## see * 0.8.1 2023-11-03 [2] CRAN (R 4.2.2)
## sessioninfo * 1.2.2 2021-12-06 [1] CRAN (R 4.2.0)
## shiny 1.7.3 2022-10-25 [2] CRAN (R 4.2.0)
## shinyjs 2.1.0 2021-12-23 [2] CRAN (R 4.2.0)
## shinystan 2.6.0 2022-03-03 [2] CRAN (R 4.2.0)
## shinythemes 1.2.0 2021-01-25 [2] CRAN (R 4.2.0)
## StanHeaders 2.21.0-7 2020-12-17 [2] CRAN (R 4.2.0)
## stringi 1.7.8 2022-07-11 [2] CRAN (R 4.2.0)
## stringr * 1.5.0 2022-12-02 [2] CRAN (R 4.2.0)
## survival 3.4-0 2022-08-09 [2] CRAN (R 4.2.2)
## sys 3.4.1 2022-10-18 [2] CRAN (R 4.2.0)
## tensorA 0.36.2 2020-11-19 [2] CRAN (R 4.2.0)
## TH.data 1.1-1 2022-04-26 [2] CRAN (R 4.2.0)
## threejs 0.3.3 2020-01-21 [2] CRAN (R 4.2.0)
## tibble * 3.1.8 2022-07-22 [2] CRAN (R 4.2.0)
## tidyr * 1.3.0 2023-01-24 [2] CRAN (R 4.2.0)
## tidyselect 1.2.0 2022-10-10 [2] CRAN (R 4.2.0)
## tidyverse * 1.3.2 2022-07-18 [2] CRAN (R 4.2.0)
## timechange 0.1.1 2022-11-04 [2] CRAN (R 4.2.0)
## tinylabels * 0.2.3 2022-02-06 [2] CRAN (R 4.2.0)
## tinytex 0.43 2022-12-13 [2] CRAN (R 4.2.2)
## tzdb 0.3.0 2022-03-28 [2] CRAN (R 4.2.0)
## usethis 2.1.6 2022-05-25 [2] CRAN (R 4.2.0)
## utf8 1.2.2 2021-07-24 [2] CRAN (R 4.2.0)
## vctrs 0.6.5 2023-12-01 [2] CRAN (R 4.2.0)
## withr 2.5.0 2022-03-03 [2] CRAN (R 4.2.0)
## worcs * 0.1.10 2022-07-20 [2] CRAN (R 4.2.0)
## xfun 0.35 2022-11-16 [2] CRAN (R 4.2.0)
## xml2 1.3.3 2021-11-30 [2] CRAN (R 4.2.0)
## xtable 1.8-4 2019-04-21 [2] CRAN (R 4.2.0)
## xts 0.12.2 2022-10-16 [2] CRAN (R 4.2.0)
## yaml 2.3.6 2022-10-18 [2] CRAN (R 4.2.0)
## zoo 1.8-11 2022-09-17 [2] CRAN (R 4.2.0)
##
## [1] /Users/giselagovaart/Library/R/x86_64/4.2/library
## [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library
##
## ──────────────────────────────────────────────────────────────────────────────
cite_packages()
## - Aust F, Barth M (2022). _papaja: Prepare reproducible APA journal articles with R Markdown_. R package version 0.1.1, <https://github.com/crsh/papaja>.
## - Barth M (2022). _tinylabels: Lightweight Variable Labels_. R package version 0.2.3, <https://cran.r-project.org/package=tinylabels>.
## - Ben-Shachar MS, Lüdecke D, Makowski D (2020). "effectsize: Estimation of Effect Size Indices and Standardized Parameters." _Journal of Open Source Software_, *5*(56), 2815. doi:10.21105/joss.02815 <https://doi.org/10.21105/joss.02815>, <https://doi.org/10.21105/joss.02815>.
## - Bürkner P (2017). "brms: An R Package for Bayesian Multilevel Models Using Stan." _Journal of Statistical Software_, *80*(1), 1-28. doi:10.18637/jss.v080.i01 <https://doi.org/10.18637/jss.v080.i01>. Bürkner P (2018). "Advanced Bayesian Multilevel Modeling with the R Package brms." _The R Journal_, *10*(1), 395-411. doi:10.32614/RJ-2018-017 <https://doi.org/10.32614/RJ-2018-017>. Bürkner P (2021). "Bayesian Item Response Modeling in R with brms and Stan." _Journal of Statistical Software_, *100*(5), 1-54. doi:10.18637/jss.v100.i05 <https://doi.org/10.18637/jss.v100.i05>.
## - Eddelbuettel D, François R (2011). "Rcpp: Seamless R and C++ Integration." _Journal of Statistical Software_, *40*(8), 1-18. doi:10.18637/jss.v040.i08 <https://doi.org/10.18637/jss.v040.i08>. Eddelbuettel D (2013). _Seamless R and C++ Integration with Rcpp_. Springer, New York. doi:10.1007/978-1-4614-6868-4 <https://doi.org/10.1007/978-1-4614-6868-4>, ISBN 978-1-4614-6867-7. Eddelbuettel D, Balamuta JJ (2018). "Extending _R_ with _C++_: A Brief Introduction to _Rcpp_." _The American Statistician_, *72*(1), 28-36. doi:10.1080/00031305.2017.1375990 <https://doi.org/10.1080/00031305.2017.1375990>.
## - Fernández-i-Marín X (2016). "ggmcmc: Analysis of MCMC Samples and Bayesian Inference." _Journal of Statistical Software_, *70*(9), 1-20. doi:10.18637/jss.v070.i09 <https://doi.org/10.18637/jss.v070.i09>.
## - Gabry J, Mahr T (2022). "bayesplot: Plotting for Bayesian Models." R package version 1.10.0, <https://mc-stan.org/bayesplot/>. Gabry J, Simpson D, Vehtari A, Betancourt M, Gelman A (2019). "Visualization in Bayesian workflow." _J. R. Stat. Soc. A_, *182*, 389-402. doi:10.1111/rssa.12378 <https://doi.org/10.1111/rssa.12378>.
## - Kooperberg C (2023). _logspline: Routines for Logspline Density Estimation_. R package version 2.1.21, <https://CRAN.R-project.org/package=logspline>.
## - Lenth R (2022). _emmeans: Estimated Marginal Means, aka Least-Squares Means_. R package version 1.8.3, <https://CRAN.R-project.org/package=emmeans>.
## - Lüdecke D, Ben-Shachar M, Patil I, Makowski D (2020). "Extracting, Computing and Exploring the Parameters of Statistical Models using R." _Journal of Open Source Software_, *5*(53), 2445. doi:10.21105/joss.02445 <https://doi.org/10.21105/joss.02445>.
## - Lüdecke D, Ben-Shachar M, Patil I, Waggoner P, Makowski D (2021). "performance: An R Package for Assessment, Comparison and Testing of Statistical Models." _Journal of Open Source Software_, *6*(60), 3139. doi:10.21105/joss.03139 <https://doi.org/10.21105/joss.03139>.
## - Lüdecke D, Ben-Shachar M, Patil I, Wiernik B, Bacher E, Thériault R, Makowski D (2022). "easystats: Framework for Easy Statistical Modeling, Visualization, and Reporting." _CRAN_. R package, <https://easystats.github.io/easystats/>.
## - Lüdecke D, Patil I, Ben-Shachar M, Wiernik B, Waggoner P, Makowski D (2021). "see: An R Package for Visualizing Statistical Models." _Journal of Open Source Software_, *6*(64), 3393. doi:10.21105/joss.03393 <https://doi.org/10.21105/joss.03393>.
## - Lüdecke D, Waggoner P, Makowski D (2019). "insight: A Unified Interface to Access Information from Model Objects in R." _Journal of Open Source Software_, *4*(38), 1412. doi:10.21105/joss.01412 <https://doi.org/10.21105/joss.01412>.
## - Makowski D, Ben-Shachar M, Lüdecke D (2019). "bayestestR: Describing Effects and their Uncertainty, Existence and Significance within the Bayesian Framework." _Journal of Open Source Software_, *4*(40), 1541. doi:10.21105/joss.01541 <https://doi.org/10.21105/joss.01541>, <https://joss.theoj.org/papers/10.21105/joss.01541>.
## - Makowski D, Ben-Shachar M, Patil I, Lüdecke D (2020). "Estimation of Model-Based Predictions, Contrasts and Means." _CRAN_. <https://github.com/easystats/modelbased>.
## - Makowski D, Lüdecke D, Patil I, Thériault R, Ben-Shachar M, Wiernik B (2023). "Automated Results Reporting as a Practical Tool to Improve Reproducibility and Methodological Best Practices Adoption." _CRAN_. <https://easystats.github.io/report/>.
## - Makowski D, Wiernik B, Patil I, Lüdecke D, Ben-Shachar M (2022). "correlation: Methods for Correlation Analysis." Version 0.8.3, <https://CRAN.R-project.org/package=correlation>. Makowski D, Ben-Shachar M, Patil I, Lüdecke D (2020). "Methods and Algorithms for Correlation Analysis in R." _Journal of Open Source Software_, *5*(51), 2306. doi:10.21105/joss.02306 <https://doi.org/10.21105/joss.02306>, <https://joss.theoj.org/papers/10.21105/joss.02306>.
## - Müller K (2020). _here: A Simpler Way to Find Your Files_. R package version 1.0.1, <https://CRAN.R-project.org/package=here>.
## - Müller K, Wickham H (2022). _tibble: Simple Data Frames_. R package version 3.1.8, <https://CRAN.R-project.org/package=tibble>.
## - Neuwirth E (2022). _RColorBrewer: ColorBrewer Palettes_. R package version 1.1-3, <https://CRAN.R-project.org/package=RColorBrewer>.
## - Patil I, Makowski D, Ben-Shachar M, Wiernik B, Bacher E, Lüdecke D (2022). "datawizard: An R Package for Easy Data Preparation and Statistical Transformations." _Journal of Open Source Software_, *7*(78), 4684. doi:10.21105/joss.04684 <https://doi.org/10.21105/joss.04684>.
## - R Core Team (2022). _R: A Language and Environment for Statistical Computing_. R Foundation for Statistical Computing, Vienna, Austria. <https://www.R-project.org/>.
## - Van Lissa CJ, Peikert A, Brandmaier AM (2022). _worcs: Workflow for Open Reproducible Code in Science_. R package version 0.1.10, <https://CRAN.R-project.org/package=worcs>.
## - Wei T, Simko V (2021). _R package 'corrplot': Visualization of a Correlation Matrix_. (Version 0.92), <https://github.com/taiyun/corrplot>.
## - Wickham H (2016). _ggplot2: Elegant Graphics for Data Analysis_. Springer-Verlag New York. ISBN 978-3-319-24277-4, <https://ggplot2.tidyverse.org>.
## - Wickham H (2022). _forcats: Tools for Working with Categorical Variables (Factors)_. R package version 0.5.2, <https://CRAN.R-project.org/package=forcats>.
## - Wickham H (2022). _stringr: Simple, Consistent Wrappers for Common String Operations_. R package version 1.5.0, <https://CRAN.R-project.org/package=stringr>.
## - Wickham H, Averick M, Bryan J, Chang W, McGowan LD, François R, Grolemund G, Hayes A, Henry L, Hester J, Kuhn M, Pedersen TL, Miller E, Bache SM, Müller K, Ooms J, Robinson D, Seidel DP, Spinu V, Takahashi K, Vaughan D, Wilke C, Woo K, Yutani H (2019). "Welcome to the tidyverse." _Journal of Open Source Software_, *4*(43), 1686. doi:10.21105/joss.01686 <https://doi.org/10.21105/joss.01686>.
## - Wickham H, Chang W, Flight R, Müller K, Hester J (2021). _sessioninfo: R Session Information_. R package version 1.2.2, <https://CRAN.R-project.org/package=sessioninfo>.
## - Wickham H, François R, Henry L, Müller K (2022). _dplyr: A Grammar of Data Manipulation_. R package version 1.0.10, <https://CRAN.R-project.org/package=dplyr>.
## - Wickham H, Henry L (2023). _purrr: Functional Programming Tools_. R package version 1.0.2, <https://CRAN.R-project.org/package=purrr>.
## - Wickham H, Hester J, Bryan J (2022). _readr: Read Rectangular Text Data_. R package version 2.1.3, <https://CRAN.R-project.org/package=readr>.
## - Wickham H, Vaughan D, Girlich M (2023). _tidyr: Tidy Messy Data_. R package version 1.3.0, <https://CRAN.R-project.org/package=tidyr>.